In quantum theory every state can be diagonalized, i.e. decomposed as a convex combination of 
perfectly distinguishable pure states. This elementary structure plays a ubiquitous role in quantum 
mechanics, quantum information theory, and quantum statistical mechanics, where it provides the 
foundation for the notions of majorization and entropy. A natural question then arises: can we 
reconstruct these notions from purely operational axioms? We address this question in the framework 
of general probabilistic theories, presenting a set of axioms that guarantee that every state can be 
diagonalized. The first axiom is Causality, which ensures that the marginal of a bipartite state is 
well defined. Then, Purity Preservation states that the set of pure transformations is closed under 
composition. The third axiom is Purification, which allows one to assign a pure state to the composition 
of a system with its environment. Finally, we introduce the axiom of Pure Sharpness, stating that 
for every system there exists at least one pure effect occurring with unit probability on some state. 
For theories satisfying our four axioms, we show a constructive algorithm for diagonalizing every 
given state. The diagonalization result allows us to formulate a majorization criterion that captures 
the convertibility of states in the operational resource theory of purity, where random reversible 
transformations are regarded as free operations. 


1 Introduction 

A canonical route to the foundations of quantum thermodynamics is provided by the theory of majorization, 
used to define an ordering among states according to their degree of mixedness ll59l[60llMl 15811 . 
In recent years, the applications of majorization have seen remarkable developments in the study of 
quantum and nano thermodynamics ll2^l3^l33l[T3]l . The viability of this approach relies heavily on the 
Hilbert space framework, for it is based on the fact that density operators can be diagonalized. Ideally, 
however, it would be desirable to have an axiomatic foundation of quantum thermodynamics based on 
purely operational axioms. 

The problem can be addressed in the framework of general probabilistic theories Il35l l26l |9l IH [T5l 
HU m [37l [Ml El- The first step in this direction is to consider probabilistic theories that satisfy an 
operational version of the spectral theorem, according to which every state can be “diagonalized”, i.e. 
decomposed as a mixture of perfectly distinguishable pure states. At this point there are two options: 
one option is to demand the diagonalizability of states as an axiom. This approach has been adopted in 
Refs. dSIlia, also in relation to the issue of defining majorization in general probabilistic theories. 
The other option is to reduce diagonalization to other operational axioms, which may provide deeper 
insights on the conceptual foundations of quantum thermodynamics. This approach will be the subject 
of the present paper. 

A diagonalization result from operational principles was proved by D’Ariano, Perinotti, and one of 
the authors in the context of the axiomatization of quantum theory in Ref. ifT^ (hereafter referred to as 
CDP), although the proof therein used the full set of axioms implying quantum theory. In this paper 
we derive the diagonalizability of states from a strictly weaker set of axioms, which is compatible with 
quantum theory on real Hilbert spaces and with other potential generalizations of quantum theory, such 
as the fermionic theory recently proposed by D’Ariano et al. in Refs. |[27ll^ . Our list of axioms consists 
of: 

• two of the six CDP axioms (Causality and Purification); 

• one axiom (Purity Preservation) that is close to the CDP axiom Atomicity of Composition, 
although not exactly equivalent to it; 

• a new axiom, which we name Pure Sharpness. 

Pure Sharpness stipulates that every physical system has at least one pure effect occurring with unit 
probability on some state. Such a pure effect can be seen as part of a yes-no test designed to check an 
elementary property, in the sense of Piron [52]. In these terms, Pure Sharpness requires that for every 
system there exists at least one property, and at least one state possessing such a property. Note that 
none of our axioms assumes that perfectly distinguishable states exist. A priori, the general probabilistic 
theories considered here may not contain any pair of perfectly distinguishable states—operationally, this 
would mean that no system described by the theory could be used to transmit a classical bit with zero 
error. The existence of perfectly distinguishable states, and the fact that every state can be broken down 
into a mixture of perfectly distinguishable pure states are non-trivial consequences of the axioms. 

Note that the presence of Purification among the axioms excludes from the start the case of classical 
probability theory. Indeed, the aim of our work is not to provide the most general conditions for the 
diagonalization of states, but rather to derive diagonalization as a first step towards an axiomatic foundation 
of quantum thermodynamics. In particular, we are searching for axioms that capture the characteristic 
traits of quantum thermodynamics, such as the link with the resource theory of entanglement 1211. From 
this point of view, Purification is an almost mandatory choice, in that it sets up a fundamental relation 
between mixed states and pure entangled states. More importantly, Purification is deeply related to the 
thermodynamic procedure that consists in considering the system in interaction with its environment in 
such a way that the composite system is isolated. In this scenario, Purification guarantees that one can 
always associate a pure state with the composite system and that the overall evolution of system and 
environment can be treated as reversible. In this way, thermodynamics is reconciled with the paradigm of 
reversible dynamics at the fundamental level. In the concrete Hilbert space setting, the purified view of 
quantum thermodynamics has been adopted in a number of works aimed at deriving the microcanonical 
and canonical ensembles l[mi4^l43l[3n[^l5^l^l46l[T^ , an idea that has been recently explored also 
in general probabilistic theories [45, 48]. 

After deriving the diagonalizability of states, we discuss the implications of the result. In particular, 
we discuss the relation of majorization, defined in terms of the probability distributions arising from 
diagonalization. Combining our axioms with an additional axiom, known as Strong Symmetry [T], we 
then show that majorization completely determines the convertibility of states in the operational resource 
theory of purity ll2T1l , where random reversible transformations are viewed as free operations. It remains 
an open question whether in the context of our axioms Strong Symmetry can be replaced with a weaker 
requirement ifT^ . 

The paper is structured as follows: in section 2 we introduce the basic framework. The four axioms 
for diagonalization are presented in section 3 and their consequences are examined in section 4. 
Section 5 contains the main result, namely the diagonalization theorem. We then discuss a number 
of results that arise from the combination of diagonalization with the Strong Symmetry axiom. Using 
these results, we analyse majorization and its applications to the resource theory of purity. The 
conclusions are drawn in the final section. 


2 Framework 


The present analysis is carried out in the framework of general probabilistic theories, adopting the specific 
variant of Refs. |[T5l[T^[T4l . known as the framework of operational-probabilistic theories (OPTs). OPTs 
arise from the marriage of the graphical language of symmetric monoidal categories HI |2l |22l [241 |55l 
with the toolbox of probability theory. Here we give a quick summary of the framework, referring the 
reader to the original papers and to the related work by Hardy llT/l |38l for a more in-depth presentation. 
A comprehensive review of the OPT framework is presented in the book chapter ifTSl . 

Physical processes can be combined in sequence or in parallel, giving rise to circuits. 

[Figure: example circuit built from a bipartite state, transformations, and effects] 

Here, A, A', A'', B, B' are systems, $\rho$ is a bipartite state, $\mathcal{A}$, $\mathcal{A}'$, and $\mathcal{B}$ are transformations, and a and 
b are effects. Circuits with no external wires, like the one in the above example, are associated with 
probabilities. We denote by 

• St(A) the set of states of system A 

• Eff(A) the set of effects on A 

• Transf(A,B) the set of transformations from A to B 

• $A \otimes B$ the composition of systems A and B 

• $\mathcal{A} \otimes \mathcal{B}$ the parallel composition of the transformations $\mathcal{A}$ and $\mathcal{B}$. 

A particular system is the trivial system I (mathematically, the unit of the tensor product), corresponding 
to the degrees of freedom ignored by the theory. States (resp. effects) are transformations with the trivial 
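system as input (resp. output). We will often make use of the short-hand notation $(a|\rho)$ to denote the 
scalar obtained by applying the effect $a$ to the state $\rho$, 

$(a|\rho) := a \circ \rho$, 

and of the notation $(a|\mathcal{C}|\rho)$ to mean 

$(a|\mathcal{C}|\rho) := a \circ \mathcal{C} \circ \rho$. 
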

We identify the scalar (a|p) with a real number in the interval [0,1], representing the probability of a 
joint occurrence of the state p and the effect a in a circuit where suitable non-deterministic elements 
are put in place. The fact that scalars are real numbers induces a notion of sum for transformations, 
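whereby the sets St(A), Transf(A,B), and Eff(A) become spanning sets of suitable vector spaces over 
the real numbers, denoted by $\mathrm{St}_{\mathbb{R}}(A)$, $\mathrm{Transf}_{\mathbb{R}}(A,B)$, and $\mathrm{Eff}_{\mathbb{R}}(A)$ respectively. In this paper we will 
restrict our attention to finite systems, i.e. systems A for which the vector spaces $\mathrm{St}_{\mathbb{R}}(A)$ and $\mathrm{Eff}_{\mathbb{R}}(A)$ 
are finite-dimensional. Also, it will be assumed as a default that the sets St(A), Transf(A,B), and 
Eff(A) are compact in the topology induced by probabilities, by which one has $\lim_{n \to +\infty} \mathcal{C}_n = \mathcal{C}$, where 
$\mathcal{C}_n, \mathcal{C} \in \mathrm{Transf}(A,B)$, if and only if 

$\lim_{n \to +\infty} (E|\mathcal{C}_n \otimes \mathcal{I}_R|\rho) = (E|\mathcal{C} \otimes \mathcal{I}_R|\rho) \qquad \forall R,\ \forall \rho \in \mathrm{St}(A \otimes R),\ \forall E \in \mathrm{Eff}(B \otimes R).$ 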
















G. Chiribella & C. M. Scandolo 




A test from A to B is a collection of transformations $\{\mathcal{C}_i\}_{i \in X}$ from A to B, which can occur in an 
experiment with outcomes in X. If A (resp. B) is the trivial system, the test is called a preparation-test 
(resp. observation-test). We stress that not all the collections of transformations are tests: the specification 
of the collections that are to be regarded as tests is part of the theory, the only requirement being 
that the set of tests is closed under parallel and sequential composition. 

If X contains a single outcome, we say that the test is deterministic. We will refer to deterministic 
transformations as channels. Following the most recent version of the formalism [14], we assume as 
part of the framework that every test arises from an observation-test performed on one of the outputs of 
a channel. The motivation for such an assumption is the idea that the readout of the outcome could be 
interpreted physically as a measurement allowed by the theory. Precisely, the assumption is the following. 


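Assumption 1 (Physicalization of readout [14]). For every pair of systems A, B, and every test $\{\mathcal{M}_i\}_{i \in X}$ 
from A to B, there exist a system C, a channel $\mathcal{M} \in \mathrm{Transf}(A, B \otimes C)$, and an observation-test 
$\{c_i\}_{i \in X} \subset \mathrm{Eff}(C)$ such that 

$\mathcal{M}_i = (\mathcal{I}_B \otimes c_i)\,\mathcal{M} \qquad \forall i \in X.$ 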


A channel $\mathcal{U}$ from A to B is called reversible if there exists a channel $\mathcal{U}^{-1}$ from B to A such that 
$\mathcal{U}^{-1}\mathcal{U} = \mathcal{I}_A$ and $\mathcal{U}\mathcal{U}^{-1} = \mathcal{I}_B$, where $\mathcal{I}_S$ is the identity channel on a generic system S. If there exists 
a reversible channel transforming A into B, we say that A and B are operationally equivalent, denoted 
by $A \simeq B$. The composition of systems is required to be symmetric, meaning that $A \otimes B \simeq B \otimes A$. 

A state $\chi \in \mathrm{St}(A)$ is called invariant if $\mathcal{U}\chi = \chi$ for every reversible channel $\mathcal{U}$. Note that, in 
general, invariant states may not exist. In this paper their existence will be a consequence of the axioms 
and of a standing assumption of finite-dimensionality. 

The pairing between states and effects leads naturally to a notion of norm. We define the norm of 
a state $\rho$ as $\|\rho\| := \sup_{a \in \mathrm{Eff}(A)} (a|\rho)$. The set of normalized (i.e. with unit norm) states of A will be 
denoted by $\mathrm{St}_1(A)$. Similarly, the norm of an effect $a$ is defined as $\|a\| := \sup_{\rho \in \mathrm{St}(A)} (a|\rho)$. The set of 
normalized effects of system A will be denoted by $\mathrm{Eff}_1(A)$. 

The probabilistic structure also offers an easy way to define pure transformations. The definition 
is based on the notion of coarse-graining, i.e. the operation of joining two or more outcomes of a test 
into a single outcome. More precisely, a test $\{\mathcal{C}_i\}_{i \in X}$ is a coarse-graining of the test $\{\mathcal{D}_j\}_{j \in Y}$ if there 
is a partition $\{Y_i\}_{i \in X}$ of Y such that $\mathcal{C}_i = \sum_{j \in Y_i} \mathcal{D}_j$ for every $i \in X$. In this case, we say that $\{\mathcal{D}_j\}_{j \in Y}$ 
is a refinement of $\{\mathcal{C}_i\}_{i \in X}$. The refinement of a given transformation is defined via the refinement of a 
test: if $\{\mathcal{D}_j\}_{j \in Y}$ is a refinement of $\{\mathcal{C}_i\}_{i \in X}$, then the transformations $\{\mathcal{D}_j\}_{j \in Y_i}$ are a refinement of the 
transformation $\mathcal{C}_i$. 

A transformation $\mathcal{C} \in \mathrm{Transf}(A,B)$ is called pure if it has only trivial refinements, namely for every 
refinement $\{\mathcal{D}_j\}$ one has $\mathcal{D}_j = p_j\,\mathcal{C}$, where $\{p_j\}$ is a probability distribution. Pure transformations 
are those for which the experimenter has maximal information about the evolution of the system. We 
denote the set of pure transformations from A to B as PurTransf(A,B). In the special case of states 
(resp. effects) of system A we use the notation PurSt(A) (resp. PurEff(A)). The set of normalized pure 
states (resp. effects) of A will be denoted by $\mathrm{PurSt}_1(A)$ (resp. $\mathrm{PurEff}_1(A)$). As usual, non-pure states 
are called mixed. 
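As a toy illustration of coarse-graining, the sketch below represents each transformation of a test by a single weight (for instance its probability on a fixed input). This simplification, the outcome labels, and the function name `coarse_grain` are assumptions made for the example and do not come from the framework itself.

```python
from fractions import Fraction

def coarse_grain(fine, partition):
    """Join outcomes of a fine test.

    fine: dict mapping each fine outcome j in Y to its weight.
    partition: dict mapping each coarse outcome i in X to the block Y_i.
    """
    used = [j for block in partition.values() for j in block]
    # The blocks Y_i must form a partition of the fine outcome set Y.
    if sorted(used) != sorted(fine):
        raise ValueError("blocks must partition the outcomes of the fine test")
    # C_i is the sum of the D_j with j in Y_i.
    return {i: sum((fine[j] for j in block), Fraction(0))
            for i, block in partition.items()}

fine_test = {"y1": Fraction(1, 2), "y2": Fraction(1, 3), "y3": Fraction(1, 6)}
coarse_test = coarse_grain(fine_test, {"x1": {"y1", "y2"}, "x2": {"y3"}})
```

Here the fine test is a refinement of the coarse one, and the total weight is unchanged by the coarse-graining.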


Definition 1. Let $\rho$ be a normalized state. We say that a state $\sigma$ is contained in $\rho$ if we can write 
$\rho = p\,\sigma + (1 - p)\,\tau$, where $p \in (0,1]$ and $\tau$ is another state. 

It is clear that no states are contained in a pure state, except the pure state itself. At the opposite side 
there are completely mixed states [15], such that every state is contained in them. 

Definition 2. We say that two transformations $\mathcal{A}, \mathcal{A}' \in \mathrm{Transf}(A,B)$ are equal upon input of the state 
$\rho \in \mathrm{St}_1(A)$ if $\mathcal{A}\sigma = \mathcal{A}'\sigma$ for every state $\sigma$ contained in $\rho$. In this case we will write $\mathcal{A} =_\rho \mathcal{A}'$. 

3 Axioms 

Here we present our four axioms for diagonalizing states. As a first axiom, we assume Causality, which 
forbids signalling from the future to the past: 

Axiom 1 (Causality |[T5l [Thil l. The outcome probabilities of a test do not depend on the choice of other 
tests performed later in the circuit. 

Causality is equivalent to the requirement that, for every system A, there exists a unique deterministic 
effect $u_A$ on A (or simply $u$, when no ambiguity can arise). Thanks to that, it is possible to define the 
marginal state of a bipartite state $\rho_{AB}$ on system A as 

$\rho_A := (\mathcal{I}_A \otimes u_B)\,\rho_{AB}.$ 

In this case we will also write $\rho_A := \mathrm{Tr}_B\,\rho_{AB}$, referring to $u_B$ as $\mathrm{Tr}_B$, as a reminder that the deterministic effect 
acts as the partial trace in quantum theory. We will tend to keep the notation Tr in formulas where the 
deterministic effect is directly applied to a state, e.g. $\mathrm{Tr}\,\rho := (u|\rho)$. 

In a causal theory (i.e. satisfying Causality), the norm of a state $\rho$ is simply given by $\|\rho\| = \mathrm{Tr}\,\rho$. 
Moreover, observation-tests are normalized in the following way (see corollary 3 of Ref. [15]): 

Proposition 1. In a causal theory, if $\{a_i\}_{i \in X}$ is an observation-test, then $\sum_{i \in X} a_i = u$. 

Causality guarantees that it is consistent to assume that the choice of a test can depend on the 
outcomes of previous tests, namely that it is possible to perform conditional tests [15]. Combined with the 
assumption of compactness, the ability to perform conditional tests implies that every state is proportional 
to a normalized state [15]. Another consequence is that all the sets St(A), Transf(A,B), and Eff(A) are 
convex. In the following we will take for granted the ability to perform conditional tests, the fact that 
every state is proportional to a normalized state, and the convexity of all the sets of transformations. 

The second axiom in our list is Purity Preservation. 

Axiom 2 (Purity Preservation* Il26l [T^ 1^ 1^ 1). Sequential and parallel compositions of pure 
transformations are pure transformations. 

We consider Purity Preservation as a fundamental requirement. Considering the theory as an 
algorithm to make deductions about physical processes, Purity Preservation ensures that, when presented 
with maximal information about two processes, the algorithm outputs maximal information about their 
composition GOl . 

The third axiom is Purification. This axiom characterizes the physical theories admitting a 
description where all deterministic processes are pure and reversible at a fundamental level. Essentially, 
Purification expresses a strengthened version of the principle of conservation of information llT7ll20ll . In its 
simplest form, Purification is phrased as a requirement about causal theories, where the marginal of a 
bipartite state is defined in a canonical way. Specifically, we say that a state $\rho \in \mathrm{St}_1(A)$ can be purified 
if there exists a pure state $\Psi \in \mathrm{PurSt}(A \otimes B)$ that has $\rho$ as its marginal on system A. In this case, we call 
$\Psi$ a purification of $\rho$, and B a purifying system. The axiom is as follows. 

*The name and the formulation of the axiom adopted here are the same as in Ref. 1201 . The original axiom was called 
Atomicity of Composition ua and involved only sequential composition. Extending the axiom to parallel composition is 
important for our purposes, because it guarantees that the product of two pure states is pure. In the axiomatization of Ref. ca 
this property was a consequence of the Local Tomography axiom, which, instead, is not assumed here. 

Axiom 3 (Purification [15, 16]). Every state can be purified and two purifications with the same purifying 
system differ by a reversible channel on the purifying system. 

Technically, the second part of the axiom states that, if $\Psi, \Psi' \in \mathrm{PurSt}_1(A \otimes B)$ are such that 
$\mathrm{Tr}_B\,\Psi_{AB} = \mathrm{Tr}_B\,\Psi'_{AB}$, then 

$\Psi'_{AB} = (\mathcal{I}_A \otimes \mathcal{U}_B)\,\Psi_{AB},$ 

where $\mathcal{U}_B$ is a reversible channel on B. 


In quantum theory, the validity of Purification lies at the foundation of all dilation theorems, such as 
Stinespring’s [57], Naimark’s [51], and Ozawa’s [50]. In the finite-dimensional setting, these theorems 
(or at least some aspects thereof) were reconstructed axiomatically in [15]. 
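For concreteness, the following sketch checks the first half of Purification in the familiar quantum case: the pure bipartite state $\sum_i \sqrt{p_i}\,|i\rangle|i\rangle$ has the diagonal state $\mathrm{diag}(p)$ as its marginal. The restriction to real amplitudes and the function names `purify_diagonal` and `marginal_on_a` are assumptions made for this illustration only.

```python
import math

def purify_diagonal(probabilities):
    """Amplitudes c[i][j] of the pure state sum_i sqrt(p_i) |i>|i>."""
    n = len(probabilities)
    return [[math.sqrt(probabilities[i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def marginal_on_a(amplitudes):
    """Marginal on A of sum_ij c[i][j] |i>|j> for real amplitudes: C C^T."""
    rows, cols = len(amplitudes), len(amplitudes[0])
    return [[sum(amplitudes[i][k] * amplitudes[j][k] for k in range(cols))
             for j in range(rows)] for i in range(rows)]

# Discarding the purifying system B returns the original mixed state.
marginal = marginal_on_a(purify_diagonal([0.25, 0.75]))
```

The second half of the axiom (uniqueness up to a reversible channel on B) is not checked by this sketch.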

Finally, we introduce a new axiom, which we name Pure Sharpness. This axiom ensures that there 
exists at least one elementary property associated with every system: 

Axiom 4 (Pure Sharpness). For every system A, there exists at least one pure effect a G PurEff(A) 
occurring with probability 1 on some state. 

Pure Sharpness is reminiscent of the Sharpness axiom used in Hardy’s 2011 axiomatization |[T7]| , 
which requires a one-to-one correspondence between pure states and effects that distinguish maximal 
sets of states. 

4 Consequences of the axioms 

4.1 Consequences of Causality, Purity Preservation, and Purification 

Here we list a few consequences of the first three axioms, which will become useful later. 

The easiest consequence of Purification is that reversible channels act transitively on the set of pure 
states (see lemma 20 in Ref. [15]): 

Proposition 2. For any pair of pure states $\psi, \psi' \in \mathrm{PurSt}_1(A)$ there exists a reversible channel $\mathcal{U}$ on A 
such that $\psi' = \mathcal{U}\psi$. 

As a consequence, every finite-dimensional system possesses one invariant state (see corollary 34 of 
Ref. IH): 

Proposition 3. For every system A, there exists a unique invariant state $\chi_A$, which is also a completely 
mixed state. 

Also, transitivity implies that the set of pure states is compact for every system (see corollary 32 of 
Ref. HU). This property is generally a non-trivial property—cf. Ref. [J] for a counterexample of a state 
space with a non-closed set of pure states. 

A crucial consequence of Purification is the steering property. 

Theorem 1 (Steering property). Let $\rho \in \mathrm{St}_1(A)$ and let $\Psi \in \mathrm{PurSt}_1(A \otimes B)$ be a purification of $\rho$. Then 
$\sigma$ is contained in $\rho$ if and only if there exist an effect $b_\sigma$ on the purifying system B and a non-zero 
probability $p$ such that 

$(\mathcal{I}_A \otimes b_\sigma)\,\Psi = p\,\sigma.$ 

Proof. The proof follows the same lines as theorem 6 and corollary 9 in Ref. [15], with the only 
difference that here we do not assume the existence of perfectly distinguishable states. In its place, we use 
the framework assumption 1, which guarantees that the outcome of every test can be read out from a 
physical system. □ 

Now we introduce a definition and a proposition which will be used later. 

Definition 3. We say that a state $\rho \in \mathrm{St}_1(A \otimes B)$ is faithful for effects of system A if, for any 
$a, a' \in \mathrm{Eff}(A)$, we have $a = a'$ if 

$(a \otimes \mathcal{I}_B)\,\rho = (a' \otimes \mathcal{I}_B)\,\rho.$ 

Proposition 4. A pure state $\Psi_{AB}$ is faithful for effects of system A if and only if its marginal $\omega_A$ on A is 
completely mixed. 

See theorems 8 and 9 of Ref. [15] for the proof. 

Combining Purification with Purity Preservation one obtains the following properties: 

Proposition 5. For every observation-test $\{a_i\}_{i \in X}$ on A there is a system B and a test 
$\{\mathcal{A}_i\}_{i \in X} \subset \mathrm{Transf}(A,B)$ such that every $\mathcal{A}_i$ is pure and $a_i = u_B \mathcal{A}_i$. 

Proposition 6. Let $a$ be an effect such that $(a|\rho) = 1$, for some $\rho \in \mathrm{St}_1(A)$. Then there exists a 
transformation $\mathcal{T}$ on A such that $a = u\mathcal{T}$ and $\mathcal{T} =_\rho \mathcal{I}$, where $\mathcal{I}$ is the identity. 

The proofs of the above propositions can be found in lemma 18 and corollary 9 of Ref. OH. 

Finally, thanks to Purification, the condition in proposition 1 also becomes a sufficient condition for a set of effects 
to be an observation-test (cf. theorem 18 of Ref. [15]). 

Proposition 7. A set of effects $\{a_i\}_{i \in X}$ is an observation-test if and only if $\sum_{i \in X} a_i = u$. 

4.2 Consequences of all the axioms 

In quantum theory, diagonalizing a state means decomposing it as a convex combination of orthogonal 
pure states, i.e. pure states that can be perfectly distinguished by a measurement. 

In a general theory, perfectly distinguishable states are defined as follows: 

Definition 4. The normalized states $\{\rho_i\}$ are perfectly distinguishable if there exists an observation-test 
$\{a_j\}$ such that $(a_j|\rho_i) = \delta_{ij}$. The test $\{a_j\}$ is called a perfectly distinguishing test. 
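The condition $(a_j|\rho_i) = \delta_{ij}$ can be checked mechanically once a concrete theory is fixed. The sketch below does so for a qubit with real matrix entries, taking the pairing to be $\mathrm{Tr}[a\rho]$ as in quantum theory; the choice of quantum theory as the example and the helper names are assumptions made for illustration.

```python
def trace_pairing(effect, state):
    """(a|rho) = Tr[a rho] for 2x2 real matrices given as nested lists."""
    return sum(effect[i][k] * state[k][i] for i in range(2) for k in range(2))

def perfectly_distinguishes(effects, states):
    """True if (a_j|rho_i) equals the Kronecker delta for all i and j."""
    return all(
        trace_pairing(effects[j], states[i]) == (1 if i == j else 0)
        for i in range(len(states))
        for j in range(len(effects))
    )

zero = [[1, 0], [0, 0]]          # |0><0|
one = [[0, 0], [0, 1]]           # |1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]  # |+><+|

orthogonal_pair_ok = perfectly_distinguishes([zero, one], [zero, one])
overlapping_pair_ok = perfectly_distinguishes([zero, one], [zero, plus])
```

The two effects sum to the identity, so they form an observation-test; they distinguish the orthogonal pair but not the overlapping one.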

Suppose we know that (a|p) = 1, where a is a pure effect. Then, we can conclude that the state p 
must be pure: 

Proposition 8. Let $a \in \mathrm{PurEff}_1(A)$. Then, there exists a pure state $\alpha \in \mathrm{PurSt}(A)$ such that $(a|\alpha) = 1$. 
Furthermore, for every $\rho \in \mathrm{St}(A)$, if $(a|\rho) = 1$, then $\rho = \alpha$. 

See lemma 26 and theorem 7 of Ref. [16] for the proof idea. 

Combining the above result with our Pure Sharpness axiom, we derive the following 

Proposition 9. For every pure state $\alpha \in \mathrm{PurSt}(A)$, there exists at least one pure effect $a \in \mathrm{PurEff}(A)$ 
such that $(a|\alpha) = 1$. 

Proof. By Pure Sharpness, there exists at least one pure effect $a_0$ such that $(a_0|\alpha_0) = 1$ for some state 
$\alpha_0$. By proposition 8, $\alpha_0$ is pure. Now, for a generic pure state $\alpha$, by transitivity, there is a reversible 
channel $\mathcal{U}$ such that $\alpha = \mathcal{U}\alpha_0$. Hence, the effect $a := a_0\,\mathcal{U}^{-1}$ is pure and $(a|\alpha) = 1$. □ 











The above result will turn out to be useful for the construction of our diagonalization procedure. A 
crucial ingredient in the derivation of the diagonalization theorem is the following 

Theorem 2. Let $\rho$ be a normalized state of system A and let $p_*$ be the probability defined as² 

$p_* = \max_{\alpha \in \mathrm{PurSt}_1(A)} \{p \in [0,1] : \rho = p\,\alpha + (1 - p)\,\sigma,\ \sigma \in \mathrm{St}_1(A)\}.$ 

Let $\Psi \in \mathrm{PurSt}_1(A \otimes B)$ be a purification of $\rho$ and let $\tilde{\rho} \in \mathrm{St}_1(B)$ be the complementary state of $\rho$, 
namely $\tilde{\rho} := \mathrm{Tr}_A\,\Psi$. Then, there exists a pure state $\beta \in \mathrm{PurSt}_1(B)$ such that $\tilde{\rho} = p_*\,\beta + (1 - p_*)\,\tau$ for 
some state $\tau \in \mathrm{St}_1(B)$. 


Proof. By hypothesis, one can write $\rho = p_*\,\alpha + (1 - p_*)\,\sigma$, where $\alpha$ is a pure state and $\sigma$ is possibly 
mixed. Let us purify $\rho$, and let $\Psi$ be one of its purifications, with purifying system B. According to the 
steering property, there exists an effect $b$ that prepares $\alpha$ with probability $p_*$, namely 

$(\mathcal{I}_A \otimes b)\,\Psi = p_*\,\alpha. \qquad (1)$ 

Let $a$ be a pure effect such that $(a|\alpha) = 1$. Applying $a$ on both sides of Eq. (1), we get 

$(a \otimes b|\Psi) = p_*.$ 

On the other hand, applying $a$ to the state $\Psi$ we obtain 

$(a \otimes \mathcal{I}_B)\,\Psi = q\,\beta, \qquad (2)$ 

where $q \in [0,1]$ and $\beta$ is a pure state (due to Purity Preservation). Now if we apply $b$, we have 

$p_* = (a \otimes b|\Psi) = q\,(b|\beta).$ 

Since $(b|\beta) \in [0,1]$, we must have $q \geq p_*$. We now prove that, in fact, equality holds. Let $\tilde{b}$ be a pure 
effect such that $(\tilde{b}|\beta) = 1$. Applying $\tilde{b}$ on both sides of Eq. (2), we obtain 

$(a \otimes \tilde{b}|\Psi) = q.$ 

By Purity Preservation, $\tilde{b}$ will induce a pure state on system A, namely 

$(\mathcal{I}_A \otimes \tilde{b})\,\Psi = p\,\alpha',$ 

where $p \in [0,1]$ and $\alpha'$ is pure. Applying $a$ to the above equation gives $q = p\,(a|\alpha') \leq p$. Since by definition we have 
$p \leq p_*$, we finally get the chain of inequalities $p_* \leq q \leq p \leq p_*$, whence $p_* = q = p$. Hence, Eq. (2) 
implies that the pure state $\beta$ arises with probability $p_*$ in a convex decomposition of the state $\tilde{\rho}$. □ 

²Note that the maximum is well defined because the set of pure states is compact, thanks to transitivity. 

A similar proof was used in lemma 30 of Ref. [16] in the special case where $\rho$ is the invariant state, 
and with stronger assumptions, i.e. Ideal Compression, which is not assumed here. 

The effect $b$ that prepares $\alpha$ with probability $p_*$ can always be taken to be pure. Indeed, the effect $\tilde{b}$ 
constructed in the proof is a pure effect that prepares a pure state $\alpha'$ on A with probability $p$. But since 
$p = p_* = q = p\,(a|\alpha')$, then $(a|\alpha') = 1$. Therefore, by proposition 8, $\alpha' = \alpha$. This shows that $\alpha$ can 
always be prepared with probability $p_*$ by using a pure effect on B. 

As a corollary we have the following: 

Corollary 1. Let $\rho \in \mathrm{St}_1(A)$ be a state and let $\tilde{\rho} \in \mathrm{St}_1(B)$ be a complementary state of $\rho$. Let $p_*(\rho)$ 
and $p_*(\tilde{\rho})$ be defined as in theorem 2 for $\rho$ and $\tilde{\rho}$ respectively. Then $p_*(\rho) = p_*(\tilde{\rho})$. 

Proof. By theorem 2 we know that there exists a pure state $\beta \in \mathrm{PurSt}_1(B)$ arising in a convex 
decomposition of $\tilde{\rho}$ with probability $p_*(\rho)$: 

$\tilde{\rho} = p_*(\rho)\,\beta + (1 - p_*(\rho))\,\tau,$ 

where $\tau$ is another state of system B. Therefore $p_*(\tilde{\rho}) \geq p_*(\rho)$. By theorem 2 applied to $\tilde{\rho}$, we know 
that there is a pure state $\alpha' \in \mathrm{PurSt}_1(A)$ arising in a convex decomposition of $\rho$ with probability $p_*(\tilde{\rho})$: 

$\rho = p_*(\tilde{\rho})\,\alpha' + (1 - p_*(\tilde{\rho}))\,\sigma',$ 

where $\sigma' \in \mathrm{St}(A)$. By definition of $p_*(\rho)$, we have $p_*(\rho) \geq p_*(\tilde{\rho})$, whence we conclude that 
$p_*(\rho) = p_*(\tilde{\rho})$. □ 

Now we are ready to prove the uniqueness of the pure effect associated with a pure state. The proof 
uses the following lemma (see lemma 29 of Ref. JUl). 

Lemma 1. Let $\chi$ be the invariant state of system A and let $\alpha$ be a normalized pure state. Then 

$p_{\max} := p_\alpha = \max\{p : \exists \sigma,\ \chi = p\,\alpha + (1 - p)\,\sigma\}$ 

does not depend on $\alpha$. 

Proposition 10. For every normalized pure state $\alpha$ there exists a unique pure effect $a$ such that $(a|\alpha) = 1$. 

The proof is identical to that of theorem 8 of Ref. ifT^ , even though we are assuming fewer 
axioms. 

We will denote by $\alpha^\dagger$ the unique pure effect associated with the pure state $\alpha$, namely such that 
$(\alpha^\dagger|\alpha) = 1$. 

We are able to establish a bijective correspondence between normalized pure states and normalized 
pure effects. As a result, we obtain the following corollary (cf. corollary 13 of Ref. lIThl ). 

Corollary 2. For every pair of pure effects $a, a' \in \mathrm{PurEff}_1(A)$, there exists a reversible channel $\mathcal{U}$ on A such that 
$a' = a\,\mathcal{U}$. 

5 Diagonalization of states 

A diagonalization of $\rho$ is a convex decomposition of $\rho$ into perfectly distinguishable pure states. The 
probabilities in such a convex decomposition will be called the eigenvalues of $\rho$. Note that, since we are 
assuming the vector space $\mathrm{St}_{\mathbb{R}}(A)$ to be finite-dimensional, diagonalizations of states will have a finite 
number of terms. Here we are not postulating the existence of perfectly distinguishable pure states, but 
this will be a result of the present set of axioms (see corollary 3). 

The starting point for diagonalization is the following 

Proposition 11. Consider $\rho = p_*\,\alpha + (1 - p_*)\,\sigma$, where $p_*$ is defined in theorem 2. We have $(\alpha^\dagger|\rho) = p_*$. 

Proof. Let $\Psi_{AB}$ be a purification of $\rho$. Then, the proof of theorem 2 yields the following equality 

$(\alpha^\dagger \otimes \mathcal{I}_B)\,\Psi_{AB} = p_*\,\beta,$ 

where $\beta$ is a normalized pure state of B. By applying the deterministic effect on both sides of the above equation, we obtain 

$p_* = p_*\,(u|\beta) = (\alpha^\dagger \otimes u|\Psi_{AB}) = (\alpha^\dagger|\rho).$ 

This shows that $(\alpha^\dagger|\rho) = p_*$. □ 


The following proposition enables us to define $p_*$ in an alternative, and perhaps simpler, way starting 
from measurements. 

Proposition 12. Let $\rho \in \mathrm{St}_1(A)$. Define $p^* := \max_{a \in \mathrm{PurEff}_1(A)} (a|\rho)$. Then $p^* = p_*$. 

Proof. By proposition 11 clearly one has $p^* \geq p_*$. Since $p^*$ is the maximum, it is achieved by some 
$a^* \in \mathrm{PurEff}_1(A)$. Therefore, 

$p^* = (a^*|\rho) = (a^* \otimes u|\Psi_{AB}),$ 

where $\Psi_{AB}$ is a purification of $\rho$. Now, $a^*$ prepares a pure state $\beta^*$ on B with probability $q \leq p_*$ (cf. 
corollary 1): 

$(a^* \otimes \mathcal{I}_B)\,\Psi_{AB} = q\,\beta^*, \qquad (a^* \otimes u|\Psi_{AB}) = q\,(u|\beta^*) = q.$ 

We then obtain $p^* = q \leq p_*$, whence, in fact, $p^* = p_*$. 

□ 
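In quantum theory the pure effects are rank-one projectors, and the maximum of $\langle\psi|\rho|\psi\rangle$ over unit vectors is the largest eigenvalue of $\rho$, so the measurement-based quantity $p^*$ can be checked numerically there. The sketch below does this for a real symmetric qubit state; the example matrix, the grid search, and the function names are assumptions chosen for illustration, not part of the general construction.

```python
import math

def largest_eigenvalue(rho):
    """Largest eigenvalue of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = rho[0][0], rho[0][1], rho[1][1]
    return (a + d) / 2 + math.hypot((a - d) / 2, b)

def max_pure_effect_probability(rho, steps=3600):
    """Grid search of <psi|rho|psi> over real unit vectors (cos t, sin t)."""
    best = 0.0
    for k in range(steps):
        t = math.pi * k / steps
        c, s = math.cos(t), math.sin(t)
        value = rho[0][0] * c * c + 2 * rho[0][1] * c * s + rho[1][1] * s * s
        best = max(best, value)
    return best

rho = [[0.7, 0.2], [0.2, 0.3]]
p_upper_star = max_pure_effect_probability(rho)   # approximates p^*
lambda_max = largest_eigenvalue(rho)
```

Real unit vectors suffice here only because the example matrix is real and symmetric; a general qubit state would require complex amplitudes.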


The result expressed in proposition 11 has important consequences about diagonalization. Since 
$(\alpha^\dagger|\rho) = p_*$, if $\rho = p_*\,\alpha + (1 - p_*)\,\sigma$, then $(\alpha^\dagger|\sigma) = 0$, provided³ $p_* \neq 1$. Besides, if $(\alpha^\dagger|\sigma) = 0$, then 
$(\alpha^\dagger|\tau) = 0$ for any state $\tau$ contained in $\sigma$. As a consequence, we have the following important corollary, 
which guarantees the existence of perfectly distinguishable pure states. 

Corollary 3. Every pure state is perfectly distinguishable from some other pure state. 


Proof. Let us consider the invariant state $\chi$. For every normalized pure state $\alpha$, we have 
$\chi = p_{\max}\,\alpha + (1 - p_{\max})\,\sigma$ (see lemma 1), where $\sigma$ is another normalized state. By proposition 11, $(\alpha^\dagger|\sigma) = 0$. If $\sigma$ 
is pure, then $\alpha$ is perfectly distinguishable from $\sigma$ by means of the observation-test $\{\alpha^\dagger, u - \alpha^\dagger\}$. If $\sigma$ is 
mixed, then $(\alpha^\dagger|\psi) = 0$ for every pure state $\psi$ contained in $\sigma$. Therefore $\alpha$ is perfectly distinguishable 
from $\psi$ again via the observation-test $\{\alpha^\dagger, u - \alpha^\dagger\}$. □ 

It is quite remarkable that the existence of perfectly distinguishable (pure) states pops out from 
the axioms, without being assumed from the start. In principle, the general theories considered in our 
framework might not have had any perfectly distinguishable states at all! 

³If $p_* = 1$, then $\rho$ is pure, and we are done. Therefore, without loss of generality we can assume $p_* \neq 1$. 

5.1 The diagonalization theorem 

Theorem 3. In a theory satisfying Causality, Purity Preservation, Purification, and Pure Sharpness 
every state of every system can be diagonalized. 

The proof uses the following lemma, which provides a condition for the perfect distinguishability of 
a set of pure states: 

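Lemma 2. If the pure states $\{\alpha_i\}_{i=1}^n$ satisfy the condition $(\alpha_i^\dagger|\alpha_j) = 0$ for every $j > i$, they are perfectly 
distinguishable. 


Proof. By hypothesis, the observation-test $\{\alpha_i^\dagger, u - \alpha_i^\dagger\}$ distinguishes perfectly between $\alpha_i$ and all the 
other pure states $\alpha_j$ with $j > i$. Equivalently, the test distinguishes perfectly between $\alpha_i$ and the mixed 
state $\rho_i := \frac{1}{n-i} \sum_{j > i} \alpha_j$. As a result, we have the condition $(u - \alpha_i^\dagger|\rho_i) = 1$. Applying proposition 6 
we can construct a transformation $\mathcal{A}_i^\perp$, which occurs with the same probability as $u - \alpha_i^\dagger$, such that 
$\mathcal{A}_i^\perp =_{\rho_i} \mathcal{I}$, and, specifically, 

$\mathcal{A}_i^\perp \alpha_j = \alpha_j \qquad \forall j > i.$ 

Moreover, the transformation $\mathcal{A}_i^\perp$ never occurs on the state $\alpha_i$. Let $\{\mathcal{A}_i, \mathcal{A}_i^\perp\}$ be a binary test containing 
the transformation $\mathcal{A}_i^\perp$. By construction, this test distinguishes without error between the state $\alpha_i$ and 
all the states $\alpha_j$ with $j > i$, in such a way that the latter are not disturbed. Using the tests $\{\mathcal{A}_i, \mathcal{A}_i^\perp\}$ it is 
easy to construct a protocol that distinguishes perfectly between the states $\{\alpha_i\}_{i=1}^n$. The protocol works 
as follows: for $i$ going from 1 to $n - 1$, perform the test $\{\mathcal{A}_i, \mathcal{A}_i^\perp\}$. If the transformation $\mathcal{A}_i$ takes place, 
then the state is $\alpha_i$. If the transformation $\mathcal{A}_i^\perp$ takes place, then perform the test $\{\mathcal{A}_{i+1}, \mathcal{A}_{i+1}^\perp\}$, and so 
on. □ 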


Proof of theorem 3. The proof consists of a constructive procedure for diagonalizing arbitrary states. In 
order to diagonalize the state $\rho$, it is enough to proceed along the following steps: 


1. Set $\rho_1 = \rho$ and $p_{*,0} = 0$. 

2. Starting from $i = 1$, decompose $\rho_i$ as $\rho_i = p_{*,i}\,\alpha_i + (1 - p_{*,i})\,\sigma_i$ as in theorem 2, and set $\rho_{i+1} = \sigma_i$, 
$p_i = p_{*,i} \prod_{j=0}^{i-1} (1 - p_{*,j})$. If $p_{*,i} = 1$, then stop, otherwise continue to the step $i + 1$. 


Recall that, at every step of the procedure, proposition 11 guarantees the condition $(\alpha_i^\dagger|\sigma_i) = 0$. Since 
by construction every state $\alpha_j$ with $j > i$ is contained in the convex decomposition of $\sigma_i$, we also have 
$(\alpha_k^\dagger|\alpha_l) = 0$ for $l > k$. Hence, lemma 2 implies that the states $\{\alpha_k\}_{k=1}^i$, generated by the first $i$ iterations 
of the protocol, are perfectly distinguishable, for any $i$. For a finite dimensional system, this means that 
the procedure has to terminate in a finite number of iterations. Once the procedure has been completed, 
the state $\rho$ is diagonalized as $\rho = \sum_i p_i\,\alpha_i$. □ 


Note that the diagonalization procedure in the above proof returns a diagonalization of $\rho$ where the
eigenvalues are naturally listed in decreasing order, namely $p_i \ge p_{i+1}$ for every $i$. Such an ordering will
become useful when dealing with majorization.
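
The bookkeeping of the weights in this procedure can be illustrated with a small sketch. The Python code below is only a toy: it assumes a classical system, where states are probability vectors and the pure states are the deterministic distributions, so that the maximal weight $p_{*,i}$ is simply the largest entry of the current vector. Classical theory does not satisfy Purification, so this is not an instance of the theorem, and the function name `peel_diagonalization` and the data layout are hypothetical choices made for illustration.

```python
from fractions import Fraction


def peel_diagonalization(rho):
    """Toy version of the iterative procedure for a classical probability vector.

    Assumption (not from the paper): the largest entry of the current vector
    plays the role of p_{*,i}, and the renormalized remainder plays the role
    of sigma_i. Returns a list of pairs (p_i, index of the pure state alpha_i).
    """
    current = [Fraction(x) for x in rho]  # rho_1 = rho
    survival = Fraction(1)                # product of (1 - p_{*,j}) for j < i
    result = []
    while True:
        p_star = max(current)
        idx = current.index(p_star)
        # p_i = p_{*,i} * prod_{j<i} (1 - p_{*,j})
        result.append((p_star * survival, idx))
        if p_star == 1:
            break
        survival *= 1 - p_star
        # sigma_i: remove the pure component and renormalize the rest
        current = [
            Fraction(0) if k == idx else x / (1 - p_star)
            for k, x in enumerate(current)
        ]
    return result


if __name__ == "__main__":
    state = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    for weight, index in peel_diagonalization(state):
        print(index, weight)
```

For the vector (1/2, 1/3, 1/6) the sketch returns the weights 1/2, 1/3, 1/6, listed in non-increasing order as in the remark above.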
5.2 Unique vs non-unique diagonalization and the majorization criterion

In quantum theory the diagonalization of every state is unique, up to different choices of bases for
degenerate eigenspaces. Is this property satisfied by the operational diagonalization? In general, it is
conceivable that different diagonalization procedures may yield different sets of eigenvalues for the same
state. On top of that, even our algorithm for diagonalizing states may not yield a single, canonical
diagonalization. It does when the eigenvalues are all distinct, but the situation may be different when two
eigenvalues coincide.

The uniqueness of the eigenvalues of a state is particularly important. In a theory where the
diagonalization is not unique, any attempt to define entropies from the eigenvalues is in serious danger of failure:
indeed, the resulting entropies would not be functions of the state, but rather of its diagonalization. At
this stage it is not clear whether the present set of axioms (Causality, Purity Preservation, Purification,
and Pure Sharpness) implies that all the diagonalizations of a given state have the same eigenvalues. We
conjecture that the answer is affirmative and plan to provide a rigorous proof in a forthcoming paper
[19]. For the moment, in this paper we will prove an intermediate result, showing that the eigenvalues
are unique if one assumes the Strong Symmetry axiom by Barnum, Müller, and Ududec [7] in addition
to our axioms.


6 Combining diagonalization with Strong Symmetry 

Strong Symmetry is a requirement on the ability to transform maximal sets of perfectly distinguishable 
pure states using reversible channels. In general, a maximal set is defined as follows: 

Definition 5. Let $\{\rho_i\}_{i=1}^n$ be a set of perfectly distinguishable states. We say that $\{\rho_i\}_{i=1}^n$ is maximal if
there is no state $\rho_{n+1}$ such that the states $\{\rho_i\}_{i=1}^{n+1}$ are perfectly distinguishable.

When the maximal set is made of pure states, this definition gives an operational characterization of
the orthonormal bases of a finite-dimensional Hilbert space. Another operational characterization of
them was given in Ref. [25] in terms of commutative †-Frobenius monoids.

With this definition, Strong Symmetry reads

Axiom 5 (Strong Symmetry [7]). The group of reversible channels acts transitively on maximal sets of
perfectly distinguishable pure states.

Strong Symmetry implies that all maximal sets of perfectly distinguishable pure states have the same 
cardinality, sometimes referred to as the dimension of the system. We call a system of dimension d a 
d-level system. Note that in a d-level system, the diagonalizations of a state have at most d terms.

In the following we present a number of results arising from the combination of diagonalization with
Strong Symmetry. These results were preliminarily discussed in the master's thesis of one of the authors
[?] and, more recently, they have appeared independently in Refs. [?]. The first result is that the
eigenvalues of the invariant state are uniquely defined:

Proposition 13. Every diagonalization of the invariant state $\chi = \sum_{i=1}^d p_i\alpha_i$ has $p_i = \frac{1}{d}$ for every $i$.

Proof. Let $\{a_i\}_{i=1}^d$ be the test that perfectly distinguishes between the states $\{\alpha_i\}_{i=1}^d$. Then, one has
$p_i = (a_i|\chi)$, for every $i$. Let us consider all the possible permutations of the pure states $\{\alpha_i\}_{i=1}^d$. For
instance, if $\pi \in S_d$, where $S_d$ is the symmetric group over $d$ elements, we can consider the permuted
states $\{\alpha_{\pi(i)}\}_{i=1}^d$, which are obviously still perfectly distinguishable. By Strong Symmetry, there is a
reversible channel $\mathcal{U}_\pi$ that implements this permutation, namely $\mathcal{U}_\pi\alpha_i = \alpha_{\pi(i)}$. Let us apply $\mathcal{U}_\pi$ to $\chi$.
Since $\chi$ is invariant, we have

$$\chi = \mathcal{U}_\pi\chi = \sum_{j=1}^d p_j\,\mathcal{U}_\pi\alpha_j = \sum_{j=1}^d p_j\,\alpha_{\pi(j)}.$$

Now let us apply $a_i$ to $\chi$. We have

$$p_i = (a_i|\chi) = \sum_{j=1}^d p_j\,(a_i|\alpha_{\pi(j)}) = p_{\pi^{-1}(i)}.$$

Since this holds for every $\pi \in S_d$, one has $p_i = p_j$ for every $j$. This implies that the eigenvalues are equal,
therefore $p_i = \frac{1}{d}$. □

Proposition 13 implies that the pure states arising in every diagonalization of the invariant state $\chi$
form a maximal set of perfectly distinguishable pure states. One can wonder about the converse: is it
true that every maximal set of perfectly distinguishable pure states, combined with equal weights, yields
the invariant state? In this case, the answer is immediate from Strong Symmetry:

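Proposition 14. Let $\{\psi_i\}_{i=1}^d$ be a maximal set of perfectly distinguishable pure states. Then one has
$\chi = \frac{1}{d}\sum_{i=1}^d\psi_i$.

Proof. Let us consider a diagonalization of $\chi$, say $\chi = \frac{1}{d}\sum_{i=1}^d\varphi_i$. By Strong Symmetry, there is a
reversible channel $\mathcal{U}$ such that $\mathcal{U}\varphi_i = \psi_i$ for every $i$. Then we have

$$\chi = \mathcal{U}\chi = \frac{1}{d}\sum_{i=1}^d\mathcal{U}\varphi_i = \frac{1}{d}\sum_{i=1}^d\psi_i.$$

□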


So far we have used the diagonalization theorem as a “black box”, without referring to the axioms 
used to prove it. Using the full power of the axioms allows us to prove stronger results. For example, we 
are able to prove that every pure maximal set admits a pure, perfectly distinguishing test: 

Lemma 3. For every pure maximal set $\{\alpha_i\}_{i=1}^d$, the pure effects $\{\alpha_i^\dagger\}_{i=1}^d$ form an observation-test, which
distinguishes perfectly between the states $\{\alpha_i\}_{i=1}^d$.

Proof. Let us consider the pure maximal set $\{\alpha_i\}_{i=1}^d$. By proposition 14 we know that $\chi = \frac{1}{d}\sum_{i=1}^d\alpha_i$.
Let us prove that the perfectly distinguishing test for $\{\alpha_i\}_{i=1}^d$ is pure, namely made of the pure effects
$\{\alpha_i^\dagger\}_{i=1}^d$. Recalling lemma 1, each $\alpha_i$ arises in the diagonalization of $\chi$ with weight $p_* = \frac{1}{d}$, and one
has that $(\alpha_i^\dagger|\alpha_j) = 0$ for $j \neq i$. Therefore $(\alpha_i^\dagger|\alpha_j) = \delta_{ij}$. We now want to prove that $\{\alpha_i^\dagger\}_{i=1}^d$ is an
observation-test. Thanks to Purification, it is sufficient to show that $\sum_{i=1}^d\alpha_i^\dagger = u$ (see proposition [?]).
Let us consider a purification $\Phi_{AB}$ of the invariant state $\chi_A$. By theorem [?] one has

$$(\alpha_i^\dagger|_A\,|\Phi)_{AB} = \frac{1}{d}\,|\beta_i)_B$$

for some set of pure states $\{\beta_i\}_{i=1}^d$. By theorem [?] we know that a diagonalization of the complementary
state is $\widetilde{\rho} = \frac{1}{d}\sum_{i=1}^d\beta_i$. Hence, we have the equality

$$\sum_{i=1}^d(\alpha_i^\dagger|_A\,|\Phi)_{AB} = |\widetilde{\rho})_B = (u|_A\,|\Phi)_{AB},$$

where the last equality follows from the definition of the complementary state $\widetilde{\rho}$. Since $\chi_A$ is completely
mixed, $\Phi_{AB}$ is faithful for effects of system A (by proposition [?]). Therefore we conclude that $\sum_{i=1}^d\alpha_i^\dagger = u$,
thus proving that $\{\alpha_i^\dagger\}_{i=1}^d$ is an observation-test. □

The above lemma allows us to prove an important result, which will be essential for the theory of 
majorization discussed in the next section. The result is the following: 

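Lemma 4. Let $\{\varphi_i\}_{i=1}^d$ and $\{\psi_i\}_{i=1}^d$ be two maximal sets of perfectly distinguishable pure states. The
matrix with entries $(\psi_i^\dagger|\varphi_j)$ is doubly stochastic.^

Proof. Clearly $(\psi_i^\dagger|\varphi_j) \ge 0$ because $(\psi_i^\dagger|\varphi_j)$ is a probability. Let us calculate $\sum_{i=1}^d(\psi_i^\dagger|\varphi_j)$. By
lemma 3 we know that $\{\psi_i^\dagger\}_{i=1}^d$ is an observation-test and therefore

$$\sum_{i=1}^d(\psi_i^\dagger|\varphi_j) = (u|\varphi_j) = 1,$$

because the $\varphi_j$'s are normalized. On the other hand, we know that the invariant state can be decomposed
as

$$\chi = \frac{1}{d}\sum_{j=1}^d\varphi_j = \frac{1}{d}\sum_{j=1}^d\psi_j$$

(cf. proposition 14). Hence, we have

$$\sum_{j=1}^d(\psi_i^\dagger|\varphi_j) = d\,(\psi_i^\dagger|\chi) = d\cdot\frac{1}{d}\sum_{j=1}^d(\psi_i^\dagger|\psi_j) = d\cdot\frac{1}{d} = 1.$$

This proves that the matrix with entries $(\psi_i^\dagger|\varphi_j)$ is doubly stochastic. □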

Double stochasticity will be the key ingredient for the results of the following section. 
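
As a sanity check of lemma 4 in the special case of quantum theory, one can take two orthonormal bases and compute the matrix of transition probabilities $|\langle\psi_i|\varphi_j\rangle|^2$, which plays the role of $(\psi_i^\dagger|\varphi_j)$. The following Python sketch does this for a qubit. The computational and Hadamard bases are chosen purely as an example, and the helper names `overlap_matrix` and `is_doubly_stochastic` are hypothetical.

```python
import math


def overlap_matrix(psis, phis):
    """Entries |<psi_i|phi_j>|^2 for two lists of vectors (quantum special case)."""
    def inner(u, v):
        # Hermitian inner product, conjugate-linear in the first argument
        return sum(complex(a).conjugate() * b for a, b in zip(u, v))

    return [[abs(inner(psi, phi)) ** 2 for phi in phis] for psi in psis]


def is_doubly_stochastic(matrix, tol=1e-12):
    """Check non-negativity and unit row and column sums up to a tolerance."""
    nonneg = all(x >= -tol for row in matrix for x in row)
    rows_ok = all(abs(sum(row) - 1) < tol for row in matrix)
    cols_ok = all(abs(sum(col) - 1) < tol for col in zip(*matrix))
    return nonneg and rows_ok and cols_ok


if __name__ == "__main__":
    s = 1 / math.sqrt(2)
    computational = [(1, 0), (0, 1)]
    hadamard = [(s, s), (s, -s)]
    m = overlap_matrix(computational, hadamard)
    print(m, is_doubly_stochastic(m))
```

The check uses a numerical tolerance because the basis vectors are stored as floating-point numbers.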


7 Majorization and the resource theory of purity 

Majorization is traditionally used as a criterion to compare the degree of mixedness of probability
distributions. Here we extend this approach to general probabilistic theories satisfying our axioms and,
provisionally, Strong Symmetry. In order to define the degree of mixedness operationally, we adopt the
resource theory of purity defined in our earlier work [21], which considered the situation where an
experimenter has limited control on the dynamics of a closed system. In this scenario, the free operations
are the Random Reversible (RaRe) channels, defined as random mixtures of reversible transformations:


^See chapter 2, A.1 of Ref. [44] for the definition of doubly stochastic matrix.
Definition 6. A channel $\mathcal{R}$ is RaRe if there exist a probability distribution $\{p_i\}_{i\in X}$ and a set of reversible
channels $\{\mathcal{U}_i\}_{i\in X}$ such that $\mathcal{R} = \sum_{i\in X}p_i\,\mathcal{U}_i$.

By definition, RaRe channels cannot increase the purity of a state. If $\rho = \mathcal{R}\sigma$, where $\mathcal{R}$ is a RaRe
channel, we say that $\rho$ is more mixed^ than $\sigma$ [21]. If $\rho$ is more mixed than $\sigma$ and $\sigma$ is more mixed than
$\rho$ we say that $\rho$ and $\sigma$ are equally mixed.
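
To make the definition concrete, here is a minimal sketch of a RaRe channel in a classical toy model, where we assume that states are probability vectors and that the reversible channels are the permutations of the outcomes. The function `apply_rare_channel` and its argument layout are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def apply_rare_channel(weights, perms, state):
    """Apply sum_k weights[k] * U_k to a probability vector.

    Assumption: each reversible channel U_k is a permutation, encoded as a
    list with perms[k][i] = j meaning that the weight at position i is moved
    to position j.
    """
    if sum(weights) != 1 or any(w < 0 for w in weights):
        raise ValueError("weights must form a probability distribution")
    out = [Fraction(0)] * len(state)
    for w, perm in zip(weights, perms):
        for i, x in enumerate(state):
            out[perm[i]] += w * x
    return out


if __name__ == "__main__":
    third = Fraction(1, 3)
    cyclic = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
    # A pure (deterministic) state is mapped to the uniform distribution.
    print(apply_rare_channel([third, third, third], cyclic, [1, 0, 0]))
```

In this toy example the uniform mixture of the three cyclic shifts sends a deterministic distribution to the uniform one, so the output is more mixed than the input in the sense defined above.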

As in all resource theories, it is important to devise methods capable of detecting the
convertibility of states under free operations [23], which gives the (pre)ordering of states. We will now show
that, under the assumptions made so far in our paper, the ordering of states according to their mixedness
is completely determined by majorization, just as it happens in quantum theory [49]. Let us start by
recalling the definition of majorization:

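Definition 7. Let $\mathbf{x}$ and $\mathbf{y}$ be vectors in $\mathbb{R}^d$ with the components arranged in decreasing order. Then, $\mathbf{x}$
is majorized by $\mathbf{y}$ (or $\mathbf{y}$ majorizes $\mathbf{x}$), and we write $\mathbf{x} \preceq \mathbf{y}$, if

• $\sum_{i=1}^k x_i \le \sum_{i=1}^k y_i$, for every $k = 1, \ldots, d-1$

• $\sum_{i=1}^d x_i = \sum_{i=1}^d y_i$.

It is known that $\mathbf{x} \preceq \mathbf{y}$ if and only if $\mathbf{x} = P\mathbf{y}$, where $P$ is a doubly stochastic matrix [34, 44].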
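
The two conditions of the definition translate directly into a comparison of partial sums. The Python sketch below is one possible way to test them. The function name `is_majorized_by` is our own, and exact rational arithmetic is assumed so that the equality of the total sums can be tested without a tolerance.

```python
from fractions import Fraction
from itertools import accumulate


def is_majorized_by(x, y):
    """Return True if x is majorized by y in the sense of definition 7."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty vectors of equal length")
    # Arrange the components in decreasing order, then take partial sums.
    px = list(accumulate(sorted(x, reverse=True)))
    py = list(accumulate(sorted(y, reverse=True)))
    partial_ok = all(a <= b for a, b in zip(px[:-1], py[:-1]))
    return partial_ok and px[-1] == py[-1]


if __name__ == "__main__":
    uniform = [Fraction(1, 3)] * 3
    biased = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    print(is_majorized_by(uniform, biased))  # uniform vector is majorized
    print(is_majorized_by(biased, uniform))
```

With floating-point inputs the equality test on the total sums should be replaced by a comparison up to a tolerance.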

Thanks to the results proved in the previous section, we are now in the position to show that
majorization of the eigenvalues is a necessary condition for the mixedness ordering of two states:

Theorem 4. In a theory satisfying Causality, Purity Preservation, Purification, Pure Sharpness, and
Strong Symmetry, let $\rho$ and $\sigma$ be two states of a generic system and let $\mathbf{p}$ and $\mathbf{q}$ be the vectors of the
eigenvalues in the diagonalizations of $\rho$ and $\sigma$. If $\rho$ is more mixed than $\sigma$, then $\mathbf{p} \preceq \mathbf{q}$.

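Proof. If $\rho$ is more mixed than $\sigma$, by definition, we have $\rho = \sum_k\lambda_k\,\mathcal{U}_k\sigma$, where $\{\lambda_k\}$ is a probability
distribution and $\mathcal{U}_k$ is a reversible channel, for every $k$. Suppose $\rho = \sum_{j=1}^d p_j\psi_j$ and $\sigma = \sum_{j=1}^d q_j\varphi_j$ are
diagonalizations of $\rho$ and $\sigma$. Then, $\rho = \sum_k\lambda_k\,\mathcal{U}_k\sigma$ becomes

$$\sum_{j=1}^d p_j\psi_j = \sum_k\lambda_k\sum_{j=1}^d q_j\,\mathcal{U}_k\varphi_j.$$

By applying $\psi_i^\dagger$ we get

$$p_i = \sum_{j=1}^d q_j\sum_k\lambda_k\,(\psi_i^\dagger|\mathcal{U}_k\varphi_j).$$

This expression can be rewritten as $p_i = \sum_{j=1}^d P_{ij}q_j$, where

$$P_{ij} := \sum_k\lambda_k\,(\psi_i^\dagger|\mathcal{U}_k\varphi_j).$$

Now, $(\psi_i^\dagger|\mathcal{U}_k\varphi_j)$ is a doubly stochastic matrix because $\{\mathcal{U}_k\varphi_j\}_{j=1}^d$ is a maximal set of perfectly
distinguishable pure states. Since the set of doubly stochastic matrices is convex [44], $P_{ij}$ is a doubly stochastic
matrix, whence the thesis. □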


As a corollary, we prove the desired result about the uniqueness of the eigenvalues. 


^The same notion appeared in Ref. [?], where it was used to identify which states are better indicators of spatial directions.
Corollary 4. In a theory satisfying Causality, Purity Preservation, Purification, Pure Sharpness, and
Strong Symmetry, all the diagonalizations of a given state have the same eigenvalues.

Proof. Let $\rho = \sum_{i=1}^d p_i\psi_i$ and $\rho = \sum_{j=1}^d q_j\varphi_j$ be two diagonalizations of a generic state $\rho$, and let $\mathbf{p}$ and
$\mathbf{q}$ be the corresponding vectors of eigenvalues. Trivially, $\rho$ is more mixed than $\rho$, which implies $\mathbf{p} \preceq \mathbf{q}$,
but also $\mathbf{q} \preceq \mathbf{p}$, therefore $\mathbf{p} = \Pi\mathbf{q}$, for some permutation matrix $\Pi$ [44]. This means that $\mathbf{p}$ and $\mathbf{q}$ differ
only by a rearrangement of their entries, whence the eigenvalues of $\rho$ are uniquely defined. □

In Ref. [?] Müller and Masanes proved that two states that are equally mixed (in our terminology)
differ by a reversible channel. For theories satisfying the axioms adopted in this paper, majorization
provides an alternative proof:

Proposition 15. In a theory satisfying Causality, Purity Preservation, Purification, Pure Sharpness, and
Strong Symmetry, two states $\rho$ and $\sigma$ are equally mixed under RaRe channels if and only if $\rho = \mathcal{U}\sigma$, for
some reversible channel $\mathcal{U}$. In particular, two equally mixed states must have the same eigenvalues.

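Proof. Sufficiency is straightforward. The proof of necessity is close to the proof of corollary 4. If $\rho$
and $\sigma$ are equally mixed, then $\mathbf{p} \preceq \mathbf{q}$ and $\mathbf{q} \preceq \mathbf{p}$, where $\mathbf{p}$ and $\mathbf{q}$ are the vectors of the eigenvalues of $\rho$ and $\sigma$
respectively. This means that $\rho$ and $\sigma$ have the same eigenvalues (see above). Thus, $\rho = \sum_{i=1}^d p_i\psi_i$ and
$\sigma = \sum_{i=1}^d p_i\varphi_i$. By Strong Symmetry, there exists a reversible channel $\mathcal{U}$ such that $\mathcal{U}\varphi_i = \psi_i$ for every
$i$. Therefore,

$$\mathcal{U}\sigma = \sum_{i=1}^d p_i\,\mathcal{U}\varphi_i = \sum_{i=1}^d p_i\psi_i = \rho.$$

□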

We conclude this section by providing a complete equivalence between majorization and the
mixedness relation. While in theorem 4 we proved that majorization of the eigenvalues is a necessary condition
for the mixedness ordering, we now show that majorization is also sufficient:

Theorem 5. In a theory satisfying Causality, Purity Preservation, Purification, Pure Sharpness, and
Strong Symmetry, let $\rho$ and $\sigma$ be two states of a generic system and let $\mathbf{p}$ and $\mathbf{q}$ be the vectors of their
eigenvalues respectively. If $\mathbf{p} \preceq \mathbf{q}$, then $\rho$ is more mixed than $\sigma$.

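Proof. If $\mathbf{p} \preceq \mathbf{q}$, one has $\mathbf{p} = P\mathbf{q}$ for some doubly stochastic matrix $P$ [34, 44]. Now, by Birkhoff's
theorem [10, 44], $P = \sum_k\lambda_k\Pi_k$, where the $\Pi_k$'s are permutation matrices and $\{\lambda_k\}$ is a probability
distribution. Therefore $\mathbf{p} = \sum_k\lambda_k\Pi_k\mathbf{q}$; specifically, this means that $p_i = \sum_k\lambda_k\sum_{j=1}^d[\Pi_k]_{ij}\,q_j$. Therefore, we
have

$$\rho = \sum_{i=1}^d p_i\psi_i = \sum_{i=1}^d\sum_k\lambda_k\sum_{j=1}^d[\Pi_k]_{ij}\,q_j\,\psi_i = \sum_k\lambda_k\sum_{j=1}^d q_j\sum_{i=1}^d[\Pi_k]_{ij}\,\psi_i. \qquad (3)$$

Now, $\sum_{i=1}^d[\Pi_k]_{ij}\,\psi_i$ is a pure state, given by $\psi_{\pi_k(j)}$ for a suitable permutation $\pi_k \in S_d$. By Strong
Symmetry, the permutation $\pi_k$ is implemented by a reversible channel $\mathcal{V}_k$. Moreover, Strong Symmetry
implies that there exists a reversible channel $\mathcal{U}$ such that $\mathcal{U}\varphi_i = \psi_i$ for every $i \in \{1, \ldots, d\}$. Defining
$\mathcal{U}_k := \mathcal{V}_k\,\mathcal{U}$, we then have

$$\mathcal{U}_k\varphi_j = \mathcal{V}_k\psi_j = \psi_{\pi_k(j)} = \sum_{i=1}^d[\Pi_k]_{ij}\,\psi_i,$$

which combined with Eq. (3) yields

$$\rho = \sum_k\lambda_k\sum_{j=1}^d q_j\,\mathcal{U}_k\varphi_j = \sum_k\lambda_k\,\mathcal{U}_k\sigma.$$

Hence, $\rho$ is more mixed than $\sigma$. □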
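
The step of the proof that invokes Birkhoff's theorem can be made explicit with a short computation. The sketch below is a standard greedy construction and not a procedure taken from the paper: it repeatedly finds a permutation supported on the non-zero entries of the residual matrix, using an augmenting-path matching, and subtracts it with the largest admissible weight. All function names are hypothetical, and exact rational entries are assumed.

```python
from fractions import Fraction


def _perfect_matching(support):
    """Kuhn's augmenting-path algorithm.

    support[i] lists the columns allowed for row i. Returns a list perm with
    perm[i] = j, or None if no perfect matching exists.
    """
    n = len(support)
    col_owner = [None] * n

    def try_assign(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            if col_owner[j] is None or try_assign(col_owner[j], seen):
                col_owner[j] = i
                return True
        return False

    for i in range(n):
        if not try_assign(i, set()):
            return None
    perm = [None] * n
    for j, i in enumerate(col_owner):
        perm[i] = j
    return perm


def birkhoff_decomposition(P):
    """Write a doubly stochastic matrix as sum_k lambda_k * Pi_k.

    Each term is (lambda_k, perm) with perm[i] = j meaning [Pi_k]_{ij} = 1.
    """
    n = len(P)
    residual = [[Fraction(x) for x in row] for row in P]
    terms = []
    while any(x != 0 for row in residual for x in row):
        support = [[j for j in range(n) if residual[i][j] > 0] for i in range(n)]
        perm = _perfect_matching(support)
        if perm is None:
            raise ValueError("input matrix is not doubly stochastic")
        lam = min(residual[i][perm[i]] for i in range(n))
        for i in range(n):
            residual[i][perm[i]] -= lam
        terms.append((lam, perm))
    return terms


if __name__ == "__main__":
    h = Fraction(1, 2)
    for lam, perm in birkhoff_decomposition([[h, h, 0], [h, 0, h], [0, h, h]]):
        print(lam, perm)
```

Each permutation returned by the sketch corresponds to one matrix $\Pi_k$ in the proof, and hence, through Strong Symmetry, to one reversible channel in the RaRe decomposition. The decomposition is in general not unique, and the greedy choice made here is only one of the possible ones.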

8 Conclusions 

In this work we have derived the diagonalization of states from four basic operational axioms: Causality,
Purity Preservation, Purification, and Pure Sharpness. Our result has several applications: first of all,
it allows one to import all the known consequences of diagonalization in the axiomatic context, such
as those presented in Ref. [7], where diagonalization was assumed as Axiom 1. For example, adding
Strong Symmetry, we obtain that the state space is self-dual, a property that plays an important role in
the reconstruction of quantum theory [?]. The combination of our four axioms with Strong Symmetry
leads to important consequences, such as the fact that the eigenvalues in the diagonalization of a state are
uniquely determined. While our results use Strong Symmetry, it remains an open question whether
this requirement can be dropped or replaced by other, weaker requirements. We conjecture that this is
indeed the case, and we plan to investigate the issue further in a forthcoming paper [?].

Another important application of our results is in the axiomatic reconstruction of (quantum)
thermodynamics. In a previous work [21], we defined an operational resource theory of purity (dual to the
resource theory of entanglement) in which free operations are random reversible channels. A natural
application of the diagonalization theorem is the formulation of a majorization criterion capable of
detecting whether a thermodynamic transition is possible or not, and of establishing quantitative measures of
mixedness [54]. Specifically, when Strong Symmetry is added to our axioms, the ordering of states in
the operational resource theory of purity is completely characterized by the majorization criterion. Such
an application also contributes to the difficult problem of finding the right requirements that guarantee a
well-behaved notion of entropy in general probabilistic theories [5, 56, ?]. To some extent, our results
suggest that having a sensible notion of entropy (and therefore having a sensible thermodynamics) is
not a generic feature of general probabilistic theories, but rather a quite stringent constraint. In addition
to the application to the axiomatization of quantum thermodynamics, it is our hope that this work will
contribute to the development of an axiomatic approach to information theory, in particular including
data compression and transmission over noisy channels.